  • #16
    Hi Tom
    Yes, I'm using Stata 12 (actually 12.1). Is that a problem? Can ppmlhdfe not be run in Stata 12? If not, is there a way around the problem? I mean:
    1) is there any way to use the ppmlhdfe command with Stata 12? or
    2) is there any way to use the ppml command to obtain the same estimates that ppmlhdfe produces?

    I also have another question. I know (Santos Silva has written this in many posts on this list) that the ppml command requires that the dependent variable (be it trade, migration, FDI and so on) should NOT be logged. But what about the ppmlhdfe command? Is Santos Silva's recommendation also valid for ppmlhdfe?

    Regards
    Romano



    • #17
      Dear Tom Zylkin,

      I am following your instructions on how to use the ppml and ppmlhdfe commands in this thread and many others. I am currently working with sample data consisting of three countries with multilateral trade (from 1998 to 2018). My only independent variables are exporter GDP, importer GDP, and distance, and I want to find out how these variables affect export volume.

      After reading this thread, to my understanding, the ppml command
      Code:
      ppml lnvol lndist lnGDPimp lnGDPexp exporter_* importer_* year_*
      and ppmlhdfe command
      Code:
      ppmlhdfe lnvol lndist lnGDPimp lnGDPexp, a(importer exporter year) vce(robust)
      should give me the same results. However this is not the case.

      Code:
      . ppml lnvol lndist lnGDPimp lnGDPexp exporter_* importer_* year_*
      
      note: checking the existence of the estimates
      
      Number of regressors excluded to ensure that the estimates exist: 4
      Excluded regressors:  exporter_1 exporter_2 importer_2 year_21
      Number of observations excluded: 0
      
      note: starting ppml estimation
      note: lnvol has noninteger values
      
      Iteration 1:   deviance =  4.493896
      Iteration 2:   deviance =  4.420987
      Iteration 3:   deviance =  4.420984
      Iteration 4:   deviance =  4.420984
      
      Number of parameters: 27
      Number of observations: 126
      Pseudo log-likelihood: -260.46453
      R-squared: .92118197
      Option strict is: off
      ------------------------------------------------------------------------------
                   |               Robust
             lnvol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
            lndist |  -.1561562   .0309894    -5.04   0.000    -.2168942   -.0954182
          lnGDPimp |    .109938    .041569     2.64   0.008     .0284643    .1914116
          lnGDPexp |   .0555364   .0383281     1.45   0.147    -.0195854    .1306582
        exporter_3 |  -.1191625   .1528654    -0.78   0.436    -.4187731    .1804482
        importer_1 |  -.0974729   .0498772    -1.95   0.051    -.1952305    .0002847
        importer_3 |   .0270576    .201956     0.13   0.893    -.3687689    .4228841
            year_1 |  -.2258452   .0935905    -2.41   0.016    -.4092792   -.0424112
            year_2 |  -.2105067   .0929665    -2.26   0.024    -.3927178   -.0282957
            year_3 |   -.169045   .0887337    -1.91   0.057    -.3429598    .0048697
            year_4 |  -.1605192   .0825138    -1.95   0.052    -.3222432    .0012048
            year_5 |  -.1382407   .0744673    -1.86   0.063     -.284194    .0077125
            year_6 |   -.098291   .0650511    -1.51   0.131    -.2257887    .0292068
            year_7 |  -.0801831    .061846    -1.30   0.195    -.2013991    .0410329
            year_8 |  -.0822553   .0601515    -1.37   0.171    -.2001501    .0356395
            year_9 |  -.0772801   .0564199    -1.37   0.171     -.187861    .0333008
           year_10 |  -.0575561   .0509048    -1.13   0.258    -.1573278    .0422155
           year_11 |  -.0422892   .0481685    -0.88   0.380    -.1366976    .0521193
           year_12 |  -.0486424    .046542    -1.05   0.296    -.1398631    .0425783
           year_13 |  -.0343536   .0433686    -0.79   0.428    -.1193546    .0506473
           year_14 |  -.0248547    .041068    -0.61   0.545    -.1053465    .0556371
           year_15 |  -.0188818    .039707    -0.48   0.634    -.0967062    .0589426
           year_16 |  -.0153306   .0391906    -0.39   0.696    -.0921427    .0614815
           year_17 |  -.0105907   .0387608    -0.27   0.785    -.0865605    .0653791
           year_18 |  -.0085547   .0389603    -0.22   0.826    -.0849155    .0678061
           year_19 |  -.0075817   .0398676    -0.19   0.849    -.0857207    .0705573
           year_20 |  -.0030358   .0400476    -0.08   0.940    -.0815278    .0754561
             _cons |   2.462663   .4698884     5.24   0.000     1.541698    3.383627
      ------------------------------------------------------------------------------
      Code:
      . ppmlhdfe lnvol lndist lnGDPimp lnGDPexp, a(importer exporter year) vce(robust)
      note: 1 variable omitted because of collinearity: lndist
      Iteration 1:   deviance = 4.4939e+00  eps = .         iters = 4    tol = 1.0e-04
      >   min(eta) =   1.10  P   
      Iteration 2:   deviance = 4.4210e+00  eps = 1.65e-02  iters = 4    tol = 1.0e-04
      >   min(eta) =   1.09      
      Iteration 3:   deviance = 4.4210e+00  eps = 6.85e-07  iters = 3    tol = 1.0e-04
      >   min(eta) =   1.09      
      Iteration 4:   deviance = 4.4210e+00  eps = 3.93e-15  iters = 2    tol = 1.0e-05
      >   min(eta) =   1.09      
      Iteration 5:   deviance = 4.4210e+00  eps = 1.04e-15  iters = 2    tol = 1.0e-08
      >   min(eta) =   1.09   S O
      --------------------------------------------------------------------------------
      > ----------------------------
      (legend: p: exact partial-out   s: exact solver   h: step-halving   o: epsilon b
      > elow tolerance)
      Converged in 5 iterations and 15 HDFE sub-iterations (tol = 1.0e-08)
      
      HDFE PPML regression                              No. of obs      =        126
      Absorbing 3 HDFE groups                           Residual df     =         99
                                                        Wald chi2(2)    =       7.52
      Deviance             =  4.420984129               Prob > chi2     =     0.0233
      Log pseudolikelihood = -260.4645284               Pseudo R2       =     0.0909
      ------------------------------------------------------------------------------
                   |               Robust
             lnvol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
            lndist |          0  (omitted)
          lnGDPimp |    .109938    .041569     2.64   0.008     .0284643    .1914116
          lnGDPexp |   .0555364   .0383281     1.45   0.147    -.0195854    .1306582
             _cons |   1.002081   .4973192     2.01   0.044     .0273532    1.976809
      ------------------------------------------------------------------------------
      
      Absorbed degrees of freedom:
      -----------------------------------------------------+
       Absorbed FE | Categories  - Redundant  = Num. Coefs |
      -------------+---------------------------------------|
          importer |         3           0           3     |
          exporter |         3           1           2     |
              year |        21           1          20    ?|
      -----------------------------------------------------+
      ? = number of redundant parameters may be higher
      Essentially, as I said above, I want to figure out how GDP and distance affect export volume. Can you help me understand what is going wrong with my data and give me any recommendations?



      • #18
        Hi Long Nguyenn,

        I believe if you move your fixed effects to the left in your use of the ppml command you will get the same result. Try the following:

        Code:
        ppml exporter_* importer_* year_* lnvol lndist lnGDPimp lnGDPexp
        Regards,
        Tom




        • #19
          Dear Tom Zylkin,

          Thank you very much for your fast reply. I followed your code, but it did not work.

          Code:
          ppml exporter_* importer_* year_* lnvol lndist lnGDPimp lnGDPexp
          Code:
          note: checking the existence of the estimates
          
          Number of regressors excluded to ensure that the estimates exist: 3
          Excluded regressors:  exporter_2 exporter_3 importer_1
          Number of observations excluded: 84
          
          note: importer_2 omitted because of collinearity
          note: importer_3 omitted because of collinearity
          note: year_1 omitted because of collinearity
          note: year_21 omitted because of collinearity
          
          note: starting ppml estimation
          
          Iteration 1:   deviance =         0
          Iteration 2:   deviance =         0
          
          Number of parameters: 24
          Number of observations: 42
          Pseudo log-likelihood: -42
          R-squared: .
          Option strict is: off
          ------------------------------------------------------------------------------
                       |               Robust
            exporter_1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                year_2 |          0  (omitted)
                year_3 |          0  (omitted)
                year_4 |          0  (omitted)
                year_5 |          0  (omitted)
                year_6 |          0  (omitted)
                year_7 |          0  (omitted)
                year_8 |          0  (omitted)
                year_9 |          0  (omitted)
               year_10 |          0  (omitted)
               year_11 |          0  (omitted)
               year_12 |          0  (omitted)
               year_13 |          0  (omitted)
               year_14 |          0  (omitted)
               year_15 |          0  (omitted)
               year_16 |          0  (omitted)
               year_17 |          0  (omitted)
               year_18 |          0  (omitted)
               year_19 |          0  (omitted)
               year_20 |          0  (omitted)
                 lnvol |          0  (omitted)
                lndist |          0  (omitted)
              lnGDPimp |          0  (omitted)
              lnGDPexp |          0  (omitted)
                 _cons |          0  (omitted)
          ------------------------------------------------------------------------------
          This results in ppml treating the exporter dummy (exporter_1) as the dependent variable. I assume you left out the trade volume (lnvol) as the dependent variable in your recommendation, so I tried the following:

          Code:
          . ppml lnvol exporter_* importer_* year_* lndist lnGDPimp lnGDPexp
          :

          The result is exactly the same as when the fixed-effect dummies are placed at the end of the command.

          Code:
          ------------------------------------------------------------------------------
                       |               Robust
                 lnvol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
            exporter_3 |  -.1191625   .1528654    -0.78   0.436    -.4187731    .1804482
            importer_1 |  -.0974729   .0498772    -1.95   0.051    -.1952305    .0002847
            importer_3 |   .0270576    .201956     0.13   0.893    -.3687689    .4228841
                year_1 |  -.2258452   .0935905    -2.41   0.016    -.4092792   -.0424112
                year_2 |  -.2105067   .0929665    -2.26   0.024    -.3927178   -.0282957
                year_3 |   -.169045   .0887337    -1.91   0.057    -.3429598    .0048697
                year_4 |  -.1605192   .0825138    -1.95   0.052    -.3222432    .0012048
                year_5 |  -.1382407   .0744673    -1.86   0.063     -.284194    .0077125
                year_6 |   -.098291   .0650511    -1.51   0.131    -.2257887    .0292068
                year_7 |  -.0801831    .061846    -1.30   0.195    -.2013991    .0410329
                year_8 |  -.0822553   .0601515    -1.37   0.171    -.2001501    .0356395
                year_9 |  -.0772801   .0564199    -1.37   0.171     -.187861    .0333008
               year_10 |  -.0575561   .0509048    -1.13   0.258    -.1573278    .0422155
               year_11 |  -.0422892   .0481685    -0.88   0.380    -.1366976    .0521193
               year_12 |  -.0486424    .046542    -1.05   0.296    -.1398631    .0425783
               year_13 |  -.0343536   .0433686    -0.79   0.428    -.1193546    .0506473
               year_14 |  -.0248547    .041068    -0.61   0.545    -.1053465    .0556371
               year_15 |  -.0188818    .039707    -0.48   0.634    -.0967062    .0589426
               year_16 |  -.0153306   .0391906    -0.39   0.696    -.0921427    .0614815
               year_17 |  -.0105907   .0387608    -0.27   0.785    -.0865605    .0653791
               year_18 |  -.0085547   .0389603    -0.22   0.826    -.0849155    .0678061
               year_19 |  -.0075817   .0398676    -0.19   0.849    -.0857207    .0705573
               year_20 |  -.0030358   .0400476    -0.08   0.940    -.0815278    .0754561
                lndist |  -.1561562   .0309894    -5.04   0.000    -.2168942   -.0954182
              lnGDPimp |    .109938    .041569     2.64   0.008     .0284643    .1914116
              lnGDPexp |   .0555364   .0383281     1.45   0.147    -.0195854    .1306582
                 _cons |   2.462663   .4698884     5.24   0.000     1.541698    3.383627
          ------------------------------------------------------------------------------

          Also, I would like to ask whether there is an option in the ppmlhdfe command that would let me obtain a coefficient for the distance variable, since it is an important variable in the gravity model. I understand that distance is directly correlated with fixed effects such as importer, exporter, or country-pair, but the ppml command with dummy variables showed a negative coefficient for distance, which is good even though it is not as large as I had hoped. However, I suspect this may simply be because my sample is too small.

          Thank you very much and I apologize for the long reply.






          • #20
            Hi Long Nguyenn,

            Yes, sorry, that was my mistake. My point was that Stata drops collinear regressors from right to left. It looks to me as though lndist is perfectly collinear with your fixed effects, but ppml is failing to detect it for some reason. You can tell that the two sets of results are actually the same: compare the estimates for all the other coefficients and compare the deviance. Here is one way to test this: run

            Code:
            reghdfe lndist, a(importer exporter year)
            If I am right, the R^2 from this regression should be 1. That would confirm that lndist is perfectly predicted by your fixed effects.
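
            A minimal sketch of that check, assuming reghdfe is installed and that importer, exporter and year are the identifier variables used in the absorb option above. After estimation, reghdfe leaves the R^2 in e(r2), so it can be displayed directly instead of being read off the output table:

            Code:
            * Regress distance on the fixed effects alone (no other regressors)
            reghdfe lndist, a(importer exporter year)
            * A value of (numerically) 1 means the fixed effects fully explain lndist
            display "R-squared = " e(r2)
            One plausible reason for this outcome, assuming distance is symmetric and constant over time: with three countries there are only three distinct distances d12, d13 and d23, and these can always be written as d_ij = a_i + a_j (for example a_1 = (d12 + d13 - d23)/2). Exporter and importer effects can therefore reproduce lndist exactly.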

            Actually, now that you mention the dependent variable: is lnvol the *log* of the variable "volume"? If so, that would be incorrect. To use ppml, you should specify the dependent variable in levels, not in logs.
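
            A sketch of the change, assuming the trade flow in levels is (or can be) stored in a variable named volume and that lnvol is its natural log. The generate line is only a fallback and is an assumption, not something from the original data:

            Code:
            * Only needed if the level variable is not already in the data
            * (assumes lnvol is the natural log of the trade flow)
            generate double volume = exp(lnvol)
            * PPML: dependent variable in levels; regressors may stay in logs
            ppmlhdfe volume lnGDPimp lnGDPexp, a(importer exporter year) vce(robust)
            Note that recovering levels from logs cannot bring back zero flows, because the log of zero is missing in Stata. If the original level variable is available, it is the safer choice.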

            Sorry again about the confusion. Let me know if the above clarifies.

            Regards,
            Tom



            • #21
              Dear Tom Zylkin,

              You are right, the R2 from the reghdfe command is exactly 1. Thank you for your clarification. I really appreciate your help here and from all the threads that I follow.

              On another note, I have tried the ppmlhdfe command again with volume in levels instead of logs and obtained the following results:

              Code:
              ppmlhdfe volume lndist lnGDPimp lnGDPexp, a(importer exporter year) cluster(lndist)
              Code:
              note: 1 variable omitted because of collinearity: lndist
              HDFE PPML regression                              No. of obs      =        126
              Absorbing 3 HDFE groups                           Residual df     =          2
              Statistics robust to heteroskedasticity           Wald chi2(2)    =      10.62
              Deviance             =   284791.733               Prob > chi2     =     0.0049
              Log pseudolikelihood = -143119.9672               Pseudo R2       =     0.9850
              
              Number of clusters (lndist) =          3
                                               (Std. Err. adjusted for 3 clusters in lndist)
              ------------------------------------------------------------------------------
                           |               Robust
                    volume |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                    lndist |          0  (omitted)
                  lnGDPimp |   1.700467   1.637339     1.04   0.299    -1.508658    4.909592
                  lnGDPexp |   1.137436   1.678489     0.68   0.498    -2.152341    4.427214
                     _cons |  -13.31204    29.5915    -0.45   0.653    -71.31033    44.68624
              ------------------------------------------------------------------------------
              Code:
              . ppmlhdfe volume lndist lnGDPimp lnGDPexp, a(importer exporter year) cluster(pairid)
              Code:
              HDFE PPML regression                              No. of obs      =        126
              Absorbing 3 HDFE groups                           Residual df     =          5
              Statistics robust to heteroskedasticity           Wald chi2(2)    =       5.16
              Deviance             =   284791.733               Prob > chi2     =     0.0759
              Log pseudolikelihood = -143119.9672               Pseudo R2       =     0.9850
              
              Number of clusters (pairid) =          6
                                               (Std. Err. adjusted for 6 clusters in pairid)
              ------------------------------------------------------------------------------
                           |               Robust
                    volume |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                    lndist |          0  (omitted)
                  lnGDPimp |   1.700467   1.180878     1.44   0.150     -.614011    4.014945
                  lnGDPexp |   1.137436   1.198446     0.95   0.343    -1.211474    3.486346
                     _cons |  -13.31204   21.10152    -0.63   0.528    -54.67026    28.04617
              ------------------------------------------------------------------------------
              Code:
              ppmlhdfe volume lndist lnGDPimp lnGDPexp, a(importer exporter year) vce(robust)
              Code:
              HDFE PPML regression                              No. of obs      =        126
              Absorbing 3 HDFE groups                           Residual df     =         99
                                                                Wald chi2(2)    =      57.50
              Deviance             =   284791.733               Prob > chi2     =     0.0000
              Log pseudolikelihood = -143119.9672               Pseudo R2       =     0.9850
              ------------------------------------------------------------------------------
                           |               Robust
                    volume |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                    lndist |          0  (omitted)
                  lnGDPimp |   1.700467   .3190902     5.33   0.000     1.075062    2.325872
                  lnGDPexp |   1.137436   .3222978     3.53   0.000     .5057441    1.769128
                     _cons |  -13.31204   5.673695    -2.35   0.019    -24.43228   -2.191806
              ------------------------------------------------------------------------------
              To my understanding, vce(robust) should give me the same result as cluster(panel variable), as stated in the xtreg help file. However, even though my coefficients are the same, the SEs are not.

              Could you help me understand what causes these differences and recommend which one I should use?



              • #22
                Hi Long Nguyenn,

                The help file for xtreg only applies to that command. In ppmlhdfe, vce(robust) should give you the same robust standard errors that ppml reports. You can confirm from one of your earlier posts that this is indeed the case.

                To give a more thorough answer, vce(robust) assumes the errors for all observations are heteroscedastic but uncorrelated with one another. cluster(pairid) instead allows them to be correlated (or "clustered") within observations that share the same pairid.

                Edited to add: in answer to your last question, since you have multiple years, clustering by pair makes more sense here.
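
                To see the two sets of standard errors side by side, something like the following could be used. This is only a sketch: it reuses the variable names from the posts above, drops lndist because it is omitted anyway, and the stored-estimate names m_robust and m_cluster are arbitrary.

                Code:
                * Heteroskedasticity-robust: observations treated as independent
                ppmlhdfe volume lnGDPimp lnGDPexp, a(importer exporter year) vce(robust)
                estimates store m_robust
                * Cluster-robust: errors may be correlated within a country pair over time
                ppmlhdfe volume lnGDPimp lnGDPexp, a(importer exporter year) vce(cluster pairid)
                estimates store m_cluster
                * Coefficients should match; only the standard errors should differ
                estimates table m_robust m_cluster, b(%9.4f) se(%9.4f)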

                I hope that helps!

                Regards,
                Tom
                Last edited by Tom Zylkin; 11 Jun 2020, 14:22.



                • #23
                  Dear Tom Zylkin,

                  Your explanation is perfect! Thank you for your help.

                  Best,
                  Long



                  • #24
                    Dear Tom Zylkin,

                    If you have time, I would like to ask you some more questions.

                    I am working on a dataset that consists of one country's exports to 15 other countries from 1998 to 2018. My general code is the following:

                    Code:
                    ppmlhdfe export lnimpgdp lnexpgdp lnexrate lnimpopen lnexpopen, a(pairid) cluster(pairid)
                    However, I am a little confused about how to use the absorb option effectively. Since I am working with a 1xN dataset, should I also absorb exporter fixed effects as well as year fixed effects? My intuition is that since I only have one exporter, variables such as exporter GDP and exporter trade openness might be explained by the exporter fixed effects and therefore be omitted. However, these variables are time-varying, so I am not sure. Could you give me a general rule of thumb for when to use exporter fixed effects and country/time fixed effects?

                    Thank you very much



                    • #25
                      Hi Long Nguyen,

                      If you only have 1 exporter, there are a couple of things to keep in mind with respect to the fixed effects:
                      - there is no need to have an exporter fixed effect. The dummy associated with this fixed effect would be equal to 1 for all observations. In other words, it's just a constant.
                      - the importer fixed effect is the same thing as a pair fixed effect, since the importer is the only thing that is different for each pair.
                      - If a variable varies by both country and time, it will not be absorbed by a country fixed effect. It will, however, be absorbed by a country-time fixed effect.
                      - In general, say you want to estimate a variable that varies by i and t. It would make sense to have (separate) i and t fixed effects. Alternatively, you might have data that varies by i, j, t, as in trade data (exporter, importer, and time). In that case, it makes sense to have it, jt, and ij fixed effects, though less strict approaches could instead have i, j, and t fixed effects, it and jt fixed effects, ij and t fixed effects, and so on. Choosing between these alternatives often depends on what variable you are trying to identify, as explained in the previous point.
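
                      For data that vary by exporter, importer, and time, the alternatives in the last point might be written as follows. This is a sketch with hypothetical names: trade, x_ijt (a regressor varying by pair and year), exporter, importer, year and pairid are assumptions, not variables from this thread.

                      Code:
                      * Strict version: exporter-time, importer-time and pair fixed effects;
                      * only regressors that vary by pair and year remain identified
                      ppmlhdfe trade x_ijt, a(exporter#year importer#year exporter#importer) cluster(pairid)
                      * Less strict version: separate exporter, importer and year fixed effects
                      ppmlhdfe trade x_ijt lnGDPimp lnGDPexp, a(exporter importer year) cluster(pairid)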

                      In your case, it is most useful to think of your data as varying by importer and time (forget exporter). Therefore, use two fixed effects: importer and time.
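
                      As a sketch of that specification, assuming importer and year are single categorical identifier variables (rather than sets of dummies) and reusing the regressor names from your command:

                      Code:
                      * One exporter: the data vary by importer and year only
                      ppmlhdfe export lnimpgdp lnexrate lnimpopen, a(importer year) cluster(pairid)
                      The exporter-level regressors (lnexpgdp and lnexpopen) are left out here because, with a single exporter, they vary only by year and would be absorbed by the year fixed effect.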

                      Regards,
                      Tom



                      • #26
                        Dear Tom Zylkin,

                        Thank you for your recommendation. I followed it and got this result:

                        Code:
                        ppmlhdfe export lnIMPgdp lnExrate lnIMPopen, a(importer_* year_*) vce(cluster pairid)
                        Code:
                        HDFE PPML regression                              No. of obs      =        315
                        Absorbing 36 HDFE groups                          Residual df     =         14
                        Statistics robust to heteroskedasticity           Wald chi2(3)    =       6.92
                        Deviance             =  64242.15402               Prob > chi2     =     0.0746
                        Log pseudolikelihood =  -33538.1381               Pseudo R2       =     0.9675
                        
                        Number of clusters (pairid) =         15
                                                        (Std. Err. adjusted for 15 clusters in pairid)
                        ------------------------------------------------------------------------------
                                     |               Robust
                              export |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                            lnIMPgdp |   .4813494   .3541968     1.36   0.174    -.2128636    1.175562
                            lnExrate |  -.1700355   .1213745    -1.40   0.161    -.4079251    .0678541
                           lnIMPopen |   .0531632   .6839801     0.08   0.938    -1.287413     1.39374
                               _cons |   6.296457   5.834532     1.08   0.281    -5.139015    17.73193
                        ------------------------------------------------------------------------------

                        I also tried some other fixed effects such as

                        Code:
                        ppmlhdfe export lnIMPgdp lnExrate lnIMPopen, a(importer_*#year_*) vce(cluster pairid)
                        Code:
                        HDFE PPML regression                              No. of obs      =        314
                        Absorbing 35 HDFE groups                          Residual df     =         14
                        Statistics robust to heteroskedasticity           Wald chi2(3)    =       6.20
                        Deviance             =  63270.63511               Prob > chi2     =     0.1024
                        Log pseudolikelihood = -33048.38428               Pseudo R2       =     0.9679
                        
                        Number of clusters (pairid) =         15
                                                        (Std. Err. adjusted for 15 clusters in pairid)
                        ------------------------------------------------------------------------------
                                     |               Robust
                              export |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                            lnIMPgdp |   .5002167   .3477729     1.44   0.150    -.1814057    1.181839
                            lnExrate |  -.1299062   .1181805    -1.10   0.272    -.3615357    .1017232
                           lnIMPopen |   .1091948     .66577     0.16   0.870     -1.19569     1.41408
                               _cons |   5.604361   5.706363     0.98   0.326    -5.579904    16.78863
                        ------------------------------------------------------------------------------
                        Code:
                        ppmlhdfe export lnIMPgdp lnExrate lnIMPopen, a(importer_*#year) vce(cluster pairid)
                        Code:
                        HDFE PPML regression                              No. of obs      =        294
                        Absorbing 15 HDFE groups                          Residual df     =         13
                        Statistics robust to heteroskedasticity           Wald chi2(3)    =       6.08
                        Deviance             =  55401.08588               Prob > chi2     =     0.1077
                        Log pseudolikelihood = -29004.27896               Pseudo R2       =     0.9555
                        
                        Number of clusters (pairid) =         14
                                                        (Std. Err. adjusted for 14 clusters in pairid)
                        ------------------------------------------------------------------------------
                                     |               Robust
                              export |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                            lnIMPgdp |   .6151594   .3405068     1.81   0.071    -.0522216     1.28254
                            lnExrate |   -.091259   .1112315    -0.82   0.412    -.3092687    .1267508
                           lnIMPopen |   .3786789   .6052406     0.63   0.532    -.8075708    1.564929
                               _cons |    3.12602   5.274238     0.59   0.553    -7.211297    13.46334
                        ------------------------------------------------------------------------------
                        These two results are very different from those of my first ppmlhdfe command, and only one coefficient is statistically significant. Since the trade volume varies by both importer country and time, my intuition was that a(importer_* year_*), a(importer_*#year), and a(importer_*#year_*) should all give the same results. However, as seen above, they do not. Could you help me understand what is going wrong? I truly appreciate your help!



                        • #27
                          Hi Long Nguyenn,

                          You are not using the absorb option correctly in these examples. You were using it correctly before. Check some of your earlier posts.
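
A minimal sketch of the distinction, assuming importer and year are the categorical variables from which the importer_* and year_* dummies were generated: absorb() takes one categorical variable per fixed effect. Passing dummy variables makes each dummy its own two-level fixed effect, which would explain why the first output above reports 36 HDFE groups (15 importer dummies plus 21 year dummies).

```stata
* Sketch only; variable names are taken from the posts above.
* absorb() expects categorical variables, not sets of dummies:
* importer and year each define one fixed effect.
ppmlhdfe export lnIMPgdp lnExrate lnIMPopen, absorb(importer year) vce(cluster pairid)
```

With this form the header should report 2 HDFE groups. Note also that with a single exporter, an importer#year interaction would leave one observation per cell, so nothing would remain to identify the coefficients.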

                          Regards,
                          Tom



                          • #28
                            Dear Tom Zylkin,

                            Thank you very much!



                            • #29
                              Dear Tom Zylkin,

                              Sorry to bother you again but I ran into some more problems. I would very much appreciate it if you could help me.

                              I am working with a sample consisting of one exporter country (Vietnam) and 30 importer countries from 1998 to 2018. My independent variables are the exporter's and importers' GDPs, the exporter's and importers' trade openness (% of GDP), and the real effective exchange rate (data from https://www.bruegel.org/publications...-new-database/).

                              My code is as follows:

                              Code:
                              ppmlhdfe exportvolume lnExrate lnEXPgdp lnIMPgdp lnEXPopen lnIMPopen, a(importer year) vce(cluster pairid)
                              Code:
                              note: 2 variables omitted because of collinearity: lnEXPgdp lnEXPopen
                              HDFE PPML regression                              No. of obs      =        609
                              Absorbing 2 HDFE groups                           Residual df     =         28
                              Statistics robust to heteroskedasticity           Wald chi2(3)    =       9.51
                              Deviance             =  118024.1806               Prob > chi2     =     0.0232
                              Log pseudolikelihood = -61627.18238               Pseudo R2       =     0.9566
                              
                              Number of clusters (pairid) =         29
                                                              (Std. Err. adjusted for 29 clusters in pairid)
                              ------------------------------------------------------------------------------
                                           |               Robust
                              exportvolume |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                              -------------+----------------------------------------------------------------
                                  lnExrate |  -.8901078   .3309642    -2.69   0.007    -1.538786   -.2414299
                                  lnEXPgdp |          0  (omitted)
                                  lnIMPgdp |   .3200044   .2898905     1.10   0.270    -.2481706    .8881794
                                 lnEXPopen |          0  (omitted)
                                 lnIMPopen |   .4969847   .3851513     1.29   0.197    -.2578979    1.251867
                                     _cons |   2.171014    4.67038     0.46   0.642    -6.982762    11.32479
                              ------------------------------------------------------------------------------
                              
                              Absorbed degrees of freedom:
                              -----------------------------------------------------+
                               Absorbed FE | Categories  - Redundant  = Num. Coefs |
                              -------------+---------------------------------------|
                                  importer |        29          29           0    *|
                                      year |        21           0          21     |
                              -----------------------------------------------------+
                              * = FE nested within cluster; treated as redundant for DoF computation

                              My questions are:

                              1) The variables that are related to my exporter country are omitted. Is there a way for me to keep these variables in the model? I understand that this is due to the fixed effects, as these variables are the same across country pairs. However, I do want to use them to explain the trade activity of a specific exporter.

                              2) Can you help me check my real effective exchange rate (REER) variable? The REER is an index number calculated as the CPI-based weighted average of a currency against a basket of currencies. I thought I could include this variable because the exchange rate does have an impact on trade. I calculate my variable by dividing the REER index of Vietnam by the REER index of the partner country. For example, in 2017 the REER for Vietnam is 104 and for Austria is 108, so I would calculate the REER of Vietnam/Austria as 104/108 = 0.96.

                              I expected the REER to have a positive impact on export volume because a depreciation of the currency, which means the REER rises, makes exports cheaper, so exports should rise as well. However, the coefficient on the REER in my model is significantly negative.

                              3) For those variables that did have coefficients, they are not statistically significant even though I have increased the sample size from last time. Do you think my sample is still too small to produce a significant result? If not then what seems to be the reason and how may I improve it?
                              Last edited by Long Nguyenn; 15 Jun 2020, 18:42.



                              • #30
                                Hi Long Nguyenn,

                                1) Because you only have one exporter, any exporter-year effects are absorbed by the year fixed effect. You could drop the year fixed effect but then you would not be accounting for omitted factors that vary by year. An important example is the value of the units your dependent variable is measured in. Assuming there is inflation over time, you will always have more trade in nominal terms in later periods than in earlier ones. A time fixed effect takes care of this. If you don't have a time fixed effect, at minimum you need to make sure your dependent variable is measured in real terms rather than in nominal terms. Likely you would need to deflate it.

                                I do have a question I would ask at this point, though. What is the sense in regressing Vietnam's exports on its openness to trade? Unless you are only using exports from a particular sector that you are focusing on, it seems to me the two things are related mechanically. I can understand why you do it for the importer because you can't use importer-year fixed effects. But there it seems like it should function as a control rather than as a variable of interest.

                                2) Sorry, your explanation of how you calculated the RER variable is a bit confusing. I agree that if Austria's RER depreciates, one would expect its imports from Vietnam to go down. Are your results implying the opposite? One does not often see the RER included as a variable in this context. Thus I can only offer limited advice. One thing to be aware of is that the RER is potentially endogenous to trade. If Austria's RER depreciates because of higher relative demand for Vietnamese-produced goods, one would observe the opposite of the expected sign.

                                3) Yes, sample size could be an issue. You only have 29 clusters, which is relatively few.
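
On point 1), a rough sketch of what deflating the dependent variable and dropping the year fixed effect might look like. The variable deflator is hypothetical (a price index with base year = 100, merged in by year), and exportvolume is assumed to be measured in nominal terms.

```stata
* Sketch only: deflator is a hypothetical price index (base year = 100).
* Convert nominal export values to real terms.
generate double real_export = exportvolume / (deflator / 100)

* With no year fixed effect, the exporter-level variables (which vary only
* by year) are no longer collinear with the absorbed effects.
ppmlhdfe real_export lnExrate lnEXPgdp lnIMPgdp lnEXPopen lnIMPopen, absorb(importer) vce(cluster pairid)
```

This specification no longer controls for other omitted factors that vary by year, so the exporter-level coefficients should be interpreted with that caveat in mind.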

                                Regards,
                                Tom
                                Last edited by Tom Zylkin; 15 Jun 2020, 19:40.
